#long-context LLM · 29/09/2025
oLLM Lets 100K-Token LLMs Run on 8 GB Consumer GPUs by Offloading Memory to SSDs
'oLLM offloads model weights and the KV cache to SSD so that very long-context LLMs can run on a single 8 GB GPU, trading inference speed, now bounded by SSD bandwidth, for massive context windows.'
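The core idea is simple to sketch: keep each layer's key/value tensors in memory-mapped files on the SSD and page in only the one layer currently doing attention. The snippet below is a minimal illustration of that pattern, not oLLM's actual API; `DiskKVCache` and every name in it are hypothetical.

```python
# Minimal sketch of an SSD-offloaded KV cache, assuming the pattern the
# summary describes for oLLM. NOT oLLM's API: all names are hypothetical.
from pathlib import Path

import numpy as np


class DiskKVCache:
    """Per-layer key/value tensors live in memory-mapped files on the SSD,
    so RAM/VRAM only ever holds the one layer currently doing attention."""

    def __init__(self, cache_dir, num_layers, max_tokens, num_heads, head_dim):
        self.dir = Path(cache_dir)
        self.dir.mkdir(parents=True, exist_ok=True)
        shape = (2, max_tokens, num_heads, head_dim)  # axis 0: key=0, value=1
        # One memmap per layer; the OS pages it to/from the SSD on demand.
        self.layers = [
            np.lib.format.open_memmap(
                str(self.dir / f"layer_{i:03d}.npy"),
                mode="w+", dtype=np.float16, shape=shape,
            )
            for i in range(num_layers)
        ]
        self.length = 0  # tokens cached so far

    def append(self, layer, k, v):
        """Write one decode step's K/V for one layer straight to disk.
        k, v: float16 arrays of shape (new_tokens, num_heads, head_dim)."""
        n = k.shape[0]
        self.layers[layer][0, self.length : self.length + n] = k
        self.layers[layer][1, self.length : self.length + n] = v

    def advance(self, new_tokens):
        """Commit the step after every layer has appended its K/V."""
        self.length += new_tokens

    def read(self, layer):
        """Pull one layer's K/V back into RAM for the attention op;
        all other layers stay on the SSD."""
        kv = np.asarray(self.layers[layer][:, : self.length])
        return kv[0], kv[1]


if __name__ == "__main__":
    # Toy decode loop: 4 layers, room for 4096 tokens (~67 MB on disk at fp16).
    cache = DiskKVCache("kv_cache", num_layers=4, max_tokens=4096,
                        num_heads=8, head_dim=128)
    for _ in range(16):  # 16 decode steps of 1 token each
        k = np.random.randn(1, 8, 128).astype(np.float16)
        v = np.random.randn(1, 8, 128).astype(np.float16)
        for layer in range(4):
            cache.append(layer, k, v)
            keys, values = cache.read(layer)  # attention would consume these
        cache.advance(1)
    print(f"cached {cache.length} tokens across 4 layers on SSD")
```

Every `read` now runs at SSD speed rather than VRAM speed, which is exactly the trade the summary names: decode throughput drops, but a 100K-token cache that could never fit in 8 GB of VRAM sits comfortably on disk.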